Cancer screening simulation models: a state of the art review

Background
Nowadays, various simulation approaches for evaluation and decision making in cancer screening can be found in the literature. This paper presents an overview of approaches used to assess screening programs for breast, lung, colorectal, prostate, and cervical cancers. Our main objectives are to describe methodological approaches and trends for different cancer sites and study populations, and to evaluate quality of cancer screening simulation studies.

Methods
A systematic literature search was performed in Medline, Web of Science, and Scopus databases. The search time frame was limited to 1999–2018 and 7101 studies were found. Of them, 621 studies met inclusion criteria, and 587 full-texts were retrieved, with 300 of the studies chosen for analysis. Finally, 263 full texts were used in the analysis (37 were excluded during the analysis). A descriptive and trend analysis of models was performed using a checklist created for the study.

Results
Currently, the most common methodological approaches in modeling cancer screening were individual-level Markov models (34% of the publications) and cohort-level Markov models (41%). The most commonly evaluated cancer types were breast (25%) and colorectal (24%) cancer. Studies on cervical cancer evaluated screening and vaccination (18%) or screening only (13%). Most studies have been conducted for North American (42%) and European (39%) populations. The number of studies with high quality scores increased over time.

Conclusions
Our findings suggest that future directions for cancer screening modelling include individual-level Markov models complemented by screening trial data, and further effort in model validation and data openness.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12911-021-01713-5.


WOS and Scopus search remarks
WOS and Scopus do not support "MeSH Terms". Therefore, all these terms and their synonyms were searched in titles, abstracts, and keywords, or were excluded if the general search part of the query already covered them.

NLP record filtering
Procedure:
1. First, 470 records were checked manually to create a learning sample (126 of them were manually evaluated as eligible). The following fields of each record were used:
   • Abstract
   • Title
   • Keywords
   • Authors
2. For every record:
   (a) All text information was combined into one text and stemmed.
   (b) Term frequency vectorization was applied.
   (c) The most frequent and least frequent terms were excluded from vectorization. The boundary parameters were optimized to obtain the best result on the test sample.
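Steps (a) to (c) of the procedure can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the study: the suffix list in `crude_stem` is a toy stand-in for a real stemmer, the record field names are hypothetical, and the `min_df` and `max_df` bounds are placeholder values, since the optimized boundaries are not reported here.

```python
import re
from collections import Counter

# Hypothetical suffix list: a crude stand-in for a real stemming algorithm.
SUFFIXES = ("ations", "ation", "ings", "ing", "ies", "es", "ed", "s")


def crude_stem(token):
    """Strip one common English suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def record_to_tokens(record):
    """Step (a): merge all text fields of a record into one stemmed token list."""
    fields = ("abstract", "title", "keywords", "authors")  # assumed field names
    text = " ".join(record.get(field, "") for field in fields)
    return [crude_stem(tok) for tok in re.findall(r"[a-z]{2,}", text.lower())]


def build_vocabulary(token_lists, min_df, max_df):
    """Step (c): keep terms whose document-frequency share lies in [min_df, max_df]."""
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per record
    n_docs = len(token_lists)
    return sorted(t for t, df in doc_freq.items() if min_df <= df / n_docs <= max_df)


def tf_vector(tokens, vocabulary):
    """Step (b): raw term-frequency vector over the retained vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Illustrative records (invented for this example, not taken from the search results).
records = [
    {"title": "Markov model of breast cancer screening",
     "abstract": "A simulation model of screening."},
    {"title": "Colorectal cancer screening simulation",
     "abstract": "Microsimulation of screening strategies."},
    {"title": "Lung cancer therapy trial",
     "abstract": "A randomized trial of therapy."},
]
token_lists = [record_to_tokens(r) for r in records]
vocabulary = build_vocabulary(token_lists, min_df=0.5, max_df=0.9)
vectors = [tf_vector(tokens, vocabulary) for tokens in token_lists]
```

In this toy sample, terms that occur in every record (such as "cancer") and terms that occur in only one record are dropped, which mirrors the intent of step (c). The resulting vectors could then be passed to any classifier trained on the manually labelled learning sample.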

Appendix C. Form for manual abstracts inspection
Keywords are automatically highlighted in the abstracts and some fields are also automatically filled.

Form fields description
Field Name: Field Description
Valid: Expert evaluation of article abstract suitability
Year: Year of publication
Structure: Short description of the model used in the paper
Cancer: Cancer type(s) observed in the paper
Outputs: Measured (calculated) values presented as the scientific results of the paper
Req: Keywords that determine the approach of the described research
Modalities: Modality of the research and simulation (early detection, prevention, therapy)

Appendix D. Full-text paper evaluation checklist

Motivation for quality criteria
We used a full-text paper evaluation checklist to assess the quality of studies. Our checklist was based on the general criteria commonly used in systematic reviews. However, the features evaluated in systematic reviews depend strongly on the nature of the research question. They also have to focus on the specific outcomes studied, and everything else must be filtered out. We were unable to find any previously developed quality assessment tool applicable to our assessment, which is not bound to specific outcomes.
The NOS was developed for empirical studies with individual-level data, so its items are not applicable to our paper. Cochrane RoB and GRADE, as well as ROBANS, ORBIT, and AHRQ, are similar in this respect. The closest tool, to our knowledge, was PROBAST, but it is also intended for assessing studies that develop models for clinical use based on empirical data, which is outside our scope.
We have to admit that our assessment method has not been validated or created by means of a Delphi method or other highly elaborate procedures. Nevertheless, we can claim that the two aspects of analysis used as quality indicators (validation and sensitivity analysis) and the two items pertaining to reporting (appropriateness and limitations) reflect important features of studies and represent constructs similar to those in well-established quality assessment tools (quality of the study conduct and reporting). We have also added a discussion of this aspect in the limitations of the study.

Automatic part
Automatic data was obtained through the https://eutils.ncbi.nlm.nih.gov/entrez/ service and from the parsed paper PDFs. Keywords are counted, and the number of terms related to a specific cancer type is compared to a threshold. If the threshold is exceeded, the paper is considered to be about this cancer type.
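The threshold rule described above can be illustrated with a short sketch. The keyword lists and the threshold value below are hypothetical, since the actual values are not reported here, and the function name `detect_cancer_types` is introduced only for this example.

```python
import re

# Hypothetical keyword lists; the actual term lists are not given in the text.
CANCER_TERMS = {
    "breast": ("breast", "mammography", "mammogram"),
    "colorectal": ("colorectal", "colon", "colonoscopy"),
    "lung": ("lung",),
    "prostate": ("prostate", "psa"),
    "cervical": ("cervical", "cervix", "hpv"),
}
THRESHOLD = 3  # assumed value, for illustration only


def detect_cancer_types(text, terms=CANCER_TERMS, threshold=THRESHOLD):
    """Return the cancer types whose related-term count exceeds the threshold."""
    tokens = re.findall(r"[a-z]+", text.lower())
    detected = []
    for cancer, keywords in terms.items():
        count = sum(1 for tok in tokens if tok in keywords)
        if count > threshold:  # "exceeded" is read here as strictly greater
            detected.append(cancer)
    return detected


sample = (
    "Breast cancer screening with mammography. Mammography intervals for "
    "breast cancer. Breast density. Lung."
)
detected = detect_cancer_types(sample)
```

A paper can be assigned to several cancer types at once under this rule, because each type is compared to the threshold independently.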

Manual part
A Windows form was used to extract specific data that cannot be extracted automatically or must be checked.

Form fields description
Field Name: Field Description
Paper is relevant: Expert evaluation of the article suitability
Model name: Keywords that briefly describe the simulation approach
Model was applied: The main result of the paper was obtained from model simulation
Model was developed: The main result of the paper is the model itself. The model is described in enough detail to be reproduced from the paper.
Sensitivity analysis: Sensitivity analysis and its results are discussed in the paper
Limitations discussed briefly: There are several sentences in the paper about model limitations
Limitations discussed in detail: There is a highlighted section in which limitations are discussed
CI and Std: Were the results in the paper presented in the appropriate format?
Validation: Were the results of model validation presented in the paper?
Overall mark (0-5): Expert opinion on the paper consistency

Expert Mark vs. Automatic Mark
The papers are assessed not only by experts. Paper quality is also assessed by an algorithm that provides a less subjective assessment, and the results of the two assessments are combined. The automatic assessment procedure is described below.
1. If the model was APPLIED and DEVELOPED in the paper:
   (a) +2 if the model was validated
   (b) +1 if a sensitivity analysis was performed
   (c) +1 if limitations were discussed
   (d) +1 if results were presented in the appropriate format
2. If the model was only DEVELOPED in the paper:
   (a) +2 if the model was validated
   (b) +1 if limitations were discussed
   (c) +2 if results were presented in the appropriate format
   Note: screening assessments are less important if the main result is the model itself.
3. If the model was only APPLIED in the paper:
   (a) +2 if a sensitivity analysis was performed
   (b) +1 if limitations were discussed
   (c) +2 if results were presented in the appropriate format
   Note: the model is considered validated.
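The three scoring rules can be written as a small function. This is a sketch under stated assumptions rather than the implementation used in the study: the function and argument names are hypothetical, a brief or a detailed discussion of limitations is assumed to earn the same point, and the case in which a model is neither applied nor developed is not covered by the rules, so it raises an error here.

```python
def automatic_mark(applied, developed, validated, sensitivity,
                   limitations, appropriate_format):
    """Compute the automatic quality mark (0-5) from the checklist flags."""
    if applied and developed:
        weights = {"validated": 2, "sensitivity": 1, "limitations": 1, "format": 1}
    elif developed:
        # Sensitivity analysis is not scored when the model itself is the result.
        weights = {"validated": 2, "sensitivity": 0, "limitations": 1, "format": 2}
    elif applied:
        # The model is considered validated, so validation is not scored.
        weights = {"validated": 0, "sensitivity": 2, "limitations": 1, "format": 2}
    else:
        # Assumption: this case is not defined by the rules above.
        raise ValueError("model must be applied, developed, or both")

    flags = {
        "validated": validated,
        "sensitivity": sensitivity,
        "limitations": limitations,
        "format": appropriate_format,
    }
    return sum(weight for name, weight in weights.items() if flags[name])


# Example: a paper that develops and applies a validated model, reports a
# sensitivity analysis and appropriate result formats, but omits limitations.
example_mark = automatic_mark(
    applied=True, developed=True, validated=True,
    sensitivity=True, limitations=False, appropriate_format=True,
)
```

In each of the three cases the weights sum to 5, so the automatic mark shares the 0-5 range of the expert overall mark.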

Models naming
Authors can give unique names to their models. We tried to unify model names using keywords that were mentioned in the descriptions of the models. Our renaming is as follows.